Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kernel: mmu: update dependencies for x86 #85327

Conversation

cfriedt
Copy link
Member

@cfriedt cfriedt commented Feb 6, 2025

  • kernel: mmu: update dependencies for x86
  • test: posix: xsi_realtime: do not exclude intel_ish platforms

Currently, some x86 platforms do not support certain MMU operations out-of-the-box.

  • intel_ish_5_4_1
  • intel_ish_5_6_0
  • intel_ish_5_8_0

Previously, linking would fail because arch_mem_map() and arch_mem_unmap() were not implemented on these platforms.

I don't claim to understand all of the nuances of the x86 arch by any means, so this change is mainly intended to fix a
regression by only allowing CONFIG_MMU to be enabled on x86 when CONFIG_X86_MMU is also enabled, which ensures that the arch-specific arch_mem_map() and arch_mem_unmap() symbols are pulled in when possible.

Fixes #85305

@cfriedt
Copy link
Member Author

cfriedt commented Feb 8, 2025

cc @edersondisouza - do you know if there is a way to implement the missing functions for those few platforms?

@nashif nashif assigned dcpleung and unassigned edersondisouza and stephanosio Feb 8, 2025
@nashif
Copy link
Member

nashif commented Feb 8, 2025

@dcpleung can you please take a look.

@@ -4,6 +4,11 @@ common:
ignore_faults: true
integration_platforms:
- mps2/an385
platform_exclude:
# CONFIG_MMU=y but no arch_mem_map() or arch_mem_unmap()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, if this is the only way possible, sure, but I hope we can find a solution without explicit platform excludes...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same - there are a couple of places where we have these excludes, and I would prefer to exclude the excludes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cfriedt looks like all posix tests are failing with this platform

image

see https://github.com/zephyrproject-rtos/zephyr/actions/runs/13221809114

so this exclusion is not the right solution indeed.

Copy link
Member Author

@cfriedt cfriedt Feb 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is the same build failure, then I think the simplest workaround would be to implement an arch_mem_map() and arch_mem_unmap() for these platforms that fails -ENOSYS.

The exclusion can still apply to mmap tests, but it would at least allow other applications to link.

@cfriedt
Copy link
Member Author

cfriedt commented Feb 10, 2025

@nashif - I think I've found a better way. Will update this PR shortly. I do not have an intel_ish_5_4_1 device (afaik). Can you or @dcpleung please have a run with it?

Testing done:

# qemu_x86 still works, as expected
west build -p -b qemu_x86 -t run tests/posix/xsi_realtime
...
SUITE PASS - 100.00% [xsi_realtime]: pass = 11, fail = 0, skip = 0, total = 11 duration = 0.131 seconds
 - PASS - [xsi_realtime.test_fs_datasync] duration = 0.002 seconds
 - PASS - [xsi_realtime.test_fs_sync] duration = 0.001 seconds
 - PASS - [xsi_realtime.test_mqueue] duration = 0.013 seconds
 - PASS - [xsi_realtime.test_mqueue_notify_basic] duration = 0.001 seconds
 - PASS - [xsi_realtime.test_mqueue_notify_errors] duration = 0.001 seconds
 - PASS - [xsi_realtime.test_mqueue_notify_non_empty_queue] duration = 0.001 seconds
 - PASS - [xsi_realtime.test_mqueue_notify_thread] duration = 0.106 seconds
 - PASS - [xsi_realtime.test_shm_mmap] duration = 0.003 seconds
 - PASS - [xsi_realtime.test_shm_open] duration = 0.001 seconds
 - PASS - [xsi_realtime.test_shm_read_write] duration = 0.001 seconds
 - PASS - [xsi_realtime.test_shm_unlink] duration = 0.001 seconds

# qemu_x86_64 still works, as expected
west build -p -b qemu_x86_64 -t run tests/posix/xsi_realtime
...
SUITE PASS - 100.00% [xsi_realtime]: pass = 11, fail = 0, skip = 0, total = 11 duration = 0.147 seconds
 - PASS - [xsi_realtime.test_fs_datasync] duration = 0.004 seconds
 - PASS - [xsi_realtime.test_fs_sync] duration = 0.001 seconds
 - PASS - [xsi_realtime.test_mqueue] duration = 0.025 seconds
 - PASS - [xsi_realtime.test_mqueue_notify_basic] duration = 0.001 seconds
 - PASS - [xsi_realtime.test_mqueue_notify_errors] duration = 0.001 seconds
 - PASS - [xsi_realtime.test_mqueue_notify_non_empty_queue] duration = 0.001 seconds
 - PASS - [xsi_realtime.test_mqueue_notify_thread] duration = 0.110 seconds
 - PASS - [xsi_realtime.test_shm_mmap] duration = 0.001 seconds
 - PASS - [xsi_realtime.test_shm_open] duration = 0.001 seconds
 - PASS - [xsi_realtime.test_shm_read_write] duration = 0.001 seconds
 - PASS - [xsi_realtime.test_shm_unlink] duration = 0.001 seconds
 
# intel_ish_5_4_1, links (and should run with the mmap tests skipped)
west build -p auto -b intel_ish_5_4_1 tests/posix/common
...
Generating files from /Users/cfriedt/zephyrproject/zephyr/build/zephyr/zephyr.elf for board: intel_ish_5_4_1
    Packing EC image file for ISH
      kernel binary size: 94972

# intel_ish_541, links and should run as expected
west build -p auto -b intel_ish_5_4_1 tests/lib/c_lib/thrd
...
Generating files from /Users/cfriedt/zephyrproject/zephyr/build/zephyr/zephyr.elf for board: intel_ish_5_4_1
    Packing EC image file for ISH
      kernel binary size: 64380

@cfriedt cfriedt force-pushed the issue/85305/intel-ish-5-8-0-libraries-libc-c11-threads---build-failure branch from c58e25c to 464c1bb Compare February 10, 2025 15:17
Currently, some x86 platforms do not support MMU operations
out-of-the-box.

I don't claim to understand all of the nuances of the x86 arch
by any means, so this change is mainly intended to fix a
regression by only allowing MMU to be selected on x86 when
x86_mmu is also selected. This affects the following platforms.

* intel_ish_5_4_1
* intel_ish_5_6_0
* intel_ish_5_8_0

Previously, linking would fail because `arch_mem_map()` and
`arch_mem_unmap()` were not implemented on these platforms.

Signed-off-by: Chris Friedt <[email protected]>
Previously, intel_ish platforms would fail to link due to
missing `arch_mem_map()` and `arch_mem_unmap()` symbols due
to nuances in the x86 architecture on those platforms.

The previous commit addresses this in kernel/Kconfig.vm
by adding specific x86 conditions for dependencies, so the
platforms no longer need to be filtered, the implementation
of `mmap()` and `munmap()` on those platforms will return -1
and set errno to `ENOTSUP`.

Signed-off-by: Chris Friedt <[email protected]>
@cfriedt cfriedt force-pushed the issue/85305/intel-ish-5-8-0-libraries-libc-c11-threads---build-failure branch from 464c1bb to 0164f55 Compare February 10, 2025 15:18
@zephyrbot zephyrbot added area: POSIX POSIX API Library area: Kernel labels Feb 10, 2025
@cfriedt cfriedt added the area: X86 x86 Architecture (32-bit) label Feb 10, 2025
@cfriedt cfriedt changed the title tests: libc: thrd: exclude selected platforms that have mmap issues kernel: mmu: update dependencies for x86 Feb 10, 2025
@dcpleung
Copy link
Member

Hm?... why is CONFIG_MMU enabled without CONFIG_X86_MMU? CONFIG_X86_MMU selects CONFIG_MMU and not the other way around. Another example is that CONFIG_ARM_MPU selects CONFIG_MPU. Maybe the option enabling CONFIG_MMU should instead be depending on it?

@nashif
Copy link
Member

nashif commented Feb 10, 2025

@cfriedt should posix be implying/selecting MMU in the first place? this is something that needs to be set by the platform, so something like this:

config POSIX_MAPPED_FILES
        bool "POSIX memory-mapped files"
        # disable Xtensa for now until linker issues are resolved
        imply MMU if (CPU_HAS_MMU && !XTENSA)

should depend on MMU and not imply/select it

@cfriedt
Copy link
Member Author

cfriedt commented Feb 10, 2025

Imply / select is correct here.

The reason being that posix option groups are effectively shorthands for manually specifying a lot of dependencies.

Imply is needed here because some architectures / platforms simply cannot support mmap.

@dcpleung
Copy link
Member

dcpleung commented Feb 10, 2025

I still think it is backwards. CONFIG_POSIX_MAPPED_FILES should depend on MMU (and default to y if MMU), instead of enabling MMU (in a sense forcibly). What if there is a board using a MMU capable CPU but the application does not need MMU?

@cfriedt
Copy link
Member Author

cfriedt commented Feb 10, 2025

I still think it is backwards. CONFIG_POSIX_MAPPED_FILES should depend on MMU (and default to y if MMU), instead of enabling MMU (in a sense forcibly). What if there is a board using a MMU capable CPU but the application does not need MMU?

CONFIG_POSIX_MAPPED_FILES should not default to y if MMU is selected. It should default to n (in general), which is how Zephyr has always worked.

The nature of POSIX Option Groups, Options, and Subprofiles is that they should be selected. All POSIX Subprofiles operate by selecting a predefined subset of Subprofiling Option Groups. In that sense, POSIX subprofiles operate as a shorthand.

CONFIG_POSIX_MAPPED_FILES is selected by generally every possible POSIX subprofile since it's required by CONFIG_POSIX_AEP_CHOICE_BASE (i.e. the mandatory system interfaces). Naturally, some architectures and platforms cannot support mapped memory, hence the imply.

There are a couple of issues still in need of attention which will alleviate concerns

  • separate C11 threads from POSIX (aka, make C11 threads use a native Zephyr thread pool)
    • CONFIG_POSIX_AEP_CHOICE_BASE not enabled (which solves the issues above)
    • CONFIG_POSIX_AEP_CHOICE_BASE then becomes roughly the perceptual equivalent of CONFIG_POSIX_API
  • POSIX kconfigs show up although POSIX APIs is not being used #75843, which has a planned fix, roughly outlined below
    • once CONFIG_POSIX_AEP_CHOICE_BASE is not selected by C11 threads, the problems above no longer exist, and we can simplify configuration (and subsequently active POSIX options)
    • probably should enable things like CONFIG_POSIX_C_LANG_SUPPORT_R or CONFIG_POSIX_C_LIB_EXT which Zephyr seems to want by default anyway (e.g. strtok_r(), gmtime_r(), strnlen())
    • the above options can be independently enabled without enabling CONFIG_POSIX_AEP_CHOICE_BASE, since they do not rely on any POSIX functionality

@cfriedt
Copy link
Member Author

cfriedt commented Feb 10, 2025

Realistically, if we want a consistent arch interface, which is the underlying reason for the regression (not really anything to do with POSIX), someone should implement even a trivial or non-functional version of arch_mem_map() and arch_mem_unmap() for the outlying x86 platforms that do not provide it.

That should naturally be something that is done by an x86 arch maintainer.

@nashif
Copy link
Member

nashif commented Feb 11, 2025

Hm?... why is CONFIG_MMU enabled without CONFIG_X86_MMU? CONFIG_X86_MMU selects CONFIG_MMU and not the other way around. Another example is that CONFIG_ARM_MPU selects CONFIG_MPU. Maybe the option enabling CONFIG_MMU should instead be depending on it?

Right.

config X86_MMU

config XTENSA_MMU

same thing for MPU.

Nothing else should be enabling MMU, neither by selection nor by implying.

I still think it is backwards. CONFIG_POSIX_MAPPED_FILES should depend on MMU (and default to y if MMU), instead of enabling MMU (in a sense forcibly). What if there is a board using a MMU capable CPU but the application does not need MMU?

CONFIG_POSIX_MAPPED_FILES should not default to y if MMU is selected. It should default to n (in general), which is how Zephyr has always worked.

The nature of POSIX Option Groups, Options, and Subprofiles is that they should be selected. All POSIX Subprofiles operate by selecting a predefined subset of Subprofiling Option Groups. In that sense, POSIX subprofiles operate as a shorthand.

CONFIG_POSIX_MAPPED_FILES is selected by generally every possible POSIX subprofile since it's required by CONFIG_POSIX_AEP_CHOICE_BASE (i.e. the mandatory system interfaces). Naturally, some architectures and platforms cannot support mapped memory, hence the imply.

There are a couple of issues still in need of attention which will alleviate concerns

  • separate C11 threads from POSIX (aka, make C11 threads use a native Zephyr thread pool)

    • CONFIG_POSIX_AEP_CHOICE_BASE not enabled (which solves the issues above)
    • CONFIG_POSIX_AEP_CHOICE_BASE then becomes roughly the perceptual equivalent of CONFIG_POSIX_API
  • POSIX kconfigs show up although POSIX APIs is not being used #75843, which has a planned fix, roughly outlined below

    • once CONFIG_POSIX_AEP_CHOICE_BASE is not selected by C11 threads, the problems above no longer exist, and we can simplify configuration (and subsequently active POSIX options)
    • probably should enable things like CONFIG_POSIX_C_LANG_SUPPORT_R or CONFIG_POSIX_C_LIB_EXT which Zephyr seems to want by default anyway (e.g. strtok_r(), gmtime_r(), strnlen())
    • the above options can be independently enabled without enabling CONFIG_POSIX_AEP_CHOICE_BASE, since they do not rely on any POSIX functionality

I do not see how any of this is related.

Realistically, if we want a consistent arch interface, which is the underlying reason for the regression (not really anything to do with POSIX), someone should implement even a trivial or non-functional version of arch_mem_map() and arch_mem_unmap() for the outlying x86 platforms that do not provide it.

That should naturally be something that is done by an x86 arch maintainer.

The reality is, MMU was enabled first on X86 and X86 has been the leading architecture for supporting MMU and everything related to MMU, so there is nothing missing and nothing to be implemented. If MMU is enabled (selected) via X86_MMU, then all of those interfaces are available, however, if you select MMU directly in a higher layer and if the platform does not do that for some reason, then things fail.

We are looking to see how to avoid this and resolve this with this particular platform, however, selecting MMU directly is not correct per the current design, see comment by @dcpleung , the maintainer of MMU in Zephyr.

@cfriedt
Copy link
Member Author

cfriedt commented Feb 16, 2025

I do not see how any of this is related.

Maybe you should try to

if you select MMU directly in a higher layer

Imply, when the application enables the feature that requires an mmu, but that was already mentioned, iirc.

and if the platform does not do that for some reason, then things fail.

Hence the workaround. "Some reason", in this case is that MMU support for x86 is somewhat inconsistent.

We are looking to see how to avoid this and resolve this with this particular platform,

Same!

however, selecting MMU directly is not correct per the current design

Implying, if the hardware supports it, and actually, it is correct.

In libraries.libc.c11_threads, POSIX_MAPPED_FILES is enabled (likely due to C11 threads pulling in generic POSIX features, which would ideally not be the case), and MMU support is implied if the hardware supports it (since that is a hardware feature required for POSIX_MAPPED_FILES). It is not enabled on architectures without MMU support.

Ideally, we could have C11 thread support without POSIX. I'll see if I can create a separate PR for that, which would prevent that failure in that testsuite (specifically by not enabling MMU when it is not needed).

However, the other POSIX tests that are failing legitimately have POSIX_MAPPED_FILES enabled because that needed by the spec (unless hardware does not support it). So the failure mode is legitimate and a workaround is needed for x86.

@cfriedt
Copy link
Member Author

cfriedt commented Feb 16, 2025

This PR simply adds missing dependencies for x86 in that the x86 MMU must be available on the x86 architecture for MMU to be enabled.

Seems pretty logical to me.

@nashif
Copy link
Member

nashif commented Feb 16, 2025

if you select MMU directly in a higher layer

Imply, when the application enables the feature that requires an mmu, but that was already mentioned, iirc.

you should neither imply nor select MMU, this is done by the arch specific implementation of MMU. Please stop repeating what was already said multiple times already. MMU shall not be enabled by a software layer, this is a hardware feature that is enabled by the platform .

and if the platform does not do that for some reason, then things fail.

Hence the workaround. "Some reason", in this case is that MMU support for x86 is somewhat inconsistent.

"Some reason" is the fact you can have MMU supported by hardware but you have the choice of not enabling it, this is as simple as that. This has nothing with X86 being inconsistent. I can do the same on ARM and Xtensa for MMUs and the same for MPUs.

however, selecting MMU directly is not correct per the current design

Implying, if the hardware supports it, and actually, it is correct.

That is not something you do at this level, enabling hardware features happens at the platform/arch level. You should not be assuming the platform wants or is able to enable a certain hardware feature because it is supported by the hardware. A platform can selectively enable hardware features, if they are advertised as supported.

In libraries.libc.c11_threads, POSIX_MAPPED_FILES is enabled (likely due to C11 threads pulling in generic POSIX features, which would ideally not be the case), and MMU support is implied if the hardware supports it (since that is a hardware feature required for POSIX_MAPPED_FILES). It is not enabled on architectures without MMU support.

yeah, the thing is, some platforms of arch XYZ will not enable all features supported by arch XYZ, so you should not assume that a feature is supported by the platform if the architecture advertises that as supported.

However, the other POSIX tests that are failing legitimately have POSIX_MAPPED_FILES enabled because that needed by the spec (unless hardware does not support it). So the failure mode is legitimate and a workaround is needed for x86.

No, no workarounds are needed for X86. The issue is not related to X86.

I have shown in the linked bug this can happen on ARM as well

This PR simply adds missing dependencies for x86 in that the x86 MMU must be available on the x86 architecture for MMU to be enabled.

Seems pretty logical to me.

There is nothing logical about this. This is uneeded workaround. Posix should not enable MMU, it should depend on it.

@@ -100,7 +100,7 @@ endif # KERNEL_VM_SUPPORT

menuconfig MMU
bool "MMU features"
depends on CPU_HAS_MMU
depends on CPU_HAS_MMU && (!X86 || (X86 && X86_MMU))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is nothing wrong with X86 here, the same issue applies to Arm and others, MMU should only be enabled by the mmu arch specific implementation and not directly.

@cfriedt
Copy link
Member Author

cfriedt commented Feb 16, 2025

If MMU is only meant to be enabled by the architecture, then why is this a user-configurable option?

Similarly, why do we have e.g. CONFIG_I2C or any hardware or architecture level Kconfig options that users can choose to enable or disable?

Why is it that an application can choose to enable MMU in one case but cannot choose to enable MMU in another case?

If an application chooses to enable the standard POSIX system interfaces, then they typically want the features that would provide them.

The argument against seems quite inconsistent. I'll try to find the arm (aarch64?) bug that was referenced as well...

@nashif
Copy link
Member

nashif commented Feb 16, 2025

If MMU is only meant to be enabled by the architecture, then why is this a user-configurable option?

Even if we make it promptless, the option would still be selectable. There were only 3 ocurrances in the tree with MMU being set in project configs and they were wrong/not needed, probably leftovers from debugging... see #85834

MMU could be made promptless, that could be done as an improvement.

Similarly, why do we have e.g. CONFIG_I2C or any hardware or architecture level Kconfig options that users can choose to enable or disable?

if you select/imply I2C r set CONFIG_I2C in your prj.conf without a supporting driver or implementation, you end up in the same situation as with MMU, no?

i2c driver instances in the tree 'select I2C' etc. It is in many ways similar.

Why is it that an application can choose to enable MMU in one case but cannot choose to enable MMU in another case?

Where do you see applications enabling MMU?

If an application chooses to enable the standard POSIX system interfaces, then they typically want the features that would provide them.

You will need to depend on those features. POSIX should not change how a plaform is setup. If you enable MMU on a platform that was not setup for MMU, things break.

The argument against seems quite inconsistent. I'll try to find the arm (aarch64?) bug that was referenced as well...

It is actually very consistent, nobody in the tree enables MMU config option outside of platforms/arch (and now POSIX).
And as was said initially by @dcpleung

Hm?... why is CONFIG_MMU enabled without CONFIG_X86_MMU? CONFIG_X86_MMU selects CONFIG_MMU and not the other way around. Another example is that CONFIG_ARM_MPU selects CONFIG_MPU. Maybe the option enabling CONFIG_MMU should instead be depending on it?

This is the current state of things, this applies to everything in the tree, not only X86. Enabling MMU without enabling the MMU implementation is not supposed to work. Maybe we need to document this in the MMU kconfig to make it clear.

@cfriedt
Copy link
Member Author

cfriedt commented Feb 16, 2025

Closing in favour of #85837

@cfriedt cfriedt closed this Feb 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: C Library C Standard Library area: Kernel area: POSIX POSIX API Library area: X86 x86 Architecture (32-bit)
Projects
None yet
7 participants